Reverse Engineering Aggregation Queries

نویسندگان

  • Wei Chit Tan
  • Meihui Zhang
  • Hazem Elmeleegy
  • Divesh Srivastava
چکیده

Query reverse engineering seeks to re-generate the SQL query that produced a given query output table from a given database. In this paper, we solve this problem for OLAP queries with group-by and aggregation. We develop a novel three-phase algorithm named REGAL 1 for this problem. First, based on a lattice graph structure, we identify a set of group-by candidates for the desired query. Second, we apply a set of aggregation constraints that are derived from the properties of aggregate operators at both the table-level and the group-level to discover candidate combinations of group-by columns and aggregations that are consistent with the given query output table. Finally, we find a multi-dimensional filter, i.e., a conjunction of selection predicates over the base table attributes, that is needed to generate the exact query output table. We conduct an extensive experimental study over the TPC-H dataset to demonstrate the effectiveness and efficiency of our proposal.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Reverse Engineering Top-k Join Queries

Ranked lists have become a fundamental tool to represent the most important items taken from a large collection of data. Search engines, sports leagues and e-commerce platforms present their results, most successful teams and most popular items in a concise and structured way by making use of ranked lists. This paper introduces the PALEO-J framework which is able to reconstruct top-k database q...

متن کامل

Exploring Databases via Reverse Engineering Ranking Queries with PALEO

A novel approach to explore databases using ranked lists is demonstrated. Working with ranked lists, capturing the relative performance of entities, is a very intuitive and widely applicable concept. Users can post lists of entities for which explanatory SQL queries and full result lists are returned. By refining the input, the results, or the queries, user can interactively explore the databas...

متن کامل

A User Driven Method for Database Reverse Engineering

In this thesis we describe the UQoRE method which supports database reverse engineering by using a data mining technique. Generally, Reverse Engineering methods work by using information extracted from data dictionaries, database extensions, application programs and expert users. The main differences between all these methods rely on the assumptions made on the a-priori knowledge available abou...

متن کامل

Reverse Engineering Top-k Database Queries with PALEO

Ranked lists are an essential methodology to succinctly summarize outstanding items, computed over database tables or crowdsourced in dedicated websites. In this work, we address the problem of reverse engineering top-k queries over a database, that is, given a relation R and a sample topk result list, our approach, named PALEO, aims at determining an SQL query that returns the provided input r...

متن کامل

Reverse Engineering Using Graph Queries

Software Reverse Engineering is the process of extracting (usually more abstract) information from software artifacts. Graph-based engineering tools work on fact repositories that keep all artifacts as graphs. Hence, information extraction can be viewed as querying this repository. This paper describes the graph query language GReQL and its use in reverse engineering tools. GReQL is an expressi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • PVLDB

دوره 10  شماره 

صفحات  -

تاریخ انتشار 2017